171 research outputs found

    Nearly optimal Bayesian Shrinkage for High Dimensional Regression

    Full text link
    During the past decade, shrinkage priors have received much attention in Bayesian analysis of high-dimensional data. In this paper, we study the problem for high-dimensional linear regression models. We show that if the shrinkage prior has a heavy and flat tail, and allocates a sufficiently large probability mass in a very small neighborhood of zero, then its posterior properties are as good as those of the spike-and-slab prior. While enjoying its efficiency in Bayesian computation, the shrinkage prior can lead to a nearly optimal contraction rate and selection consistency as the spike-and-slab prior. Our numerical results show that under posterior consistency, Bayesian methods can yield much better results in variable selection than the regularization methods, such as Lasso and SCAD. We also establish a Bernstein von-Mises type results comparable to Castillo et al (2015), this result leads to a convenient way to quantify uncertainties of the regression coefficient estimates, which has been beyond the ability of regularization methods

    Improving SAMC using smoothing methods: Theory and applications to Bayesian model selection problems

    Get PDF
    Stochastic approximation Monte Carlo (SAMC) has recently been proposed by Liang, Liu and Carroll [J. Amer. Statist. Assoc. 102 (2007) 305--320] as a general simulation and optimization algorithm. In this paper, we propose to improve its convergence using smoothing methods and discuss the application of the new algorithm to Bayesian model selection problems. The new algorithm is tested through a change-point identification example. The numerical results indicate that the new algorithm can outperform SAMC and reversible jump MCMC significantly for the model selection problems. The new algorithm represents a general form of the stochastic approximation Markov chain Monte Carlo algorithm. It allows multiple samples to be generated at each iteration, and a bias term to be included in the parameter updating step. A rigorous proof for the convergence of the general algorithm is established under verifiable conditions. This paper also provides a framework on how to improve efficiency of Monte Carlo simulations by incorporating some nonparametric techniques.Comment: Published in at http://dx.doi.org/10.1214/07-AOS577 the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Annealing Stochastic Approximation Monte Carlo for Global Optimization

    Get PDF

    A Double Regression Method for Graphical Modeling of High-dimensional Nonlinear and Non-Gaussian Data

    Full text link
    Graphical models have long been studied in statistics as a tool for inferring conditional independence relationships among a large set of random variables. The most existing works in graphical modeling focus on the cases that the data are Gaussian or mixed and the variables are linearly dependent. In this paper, we propose a double regression method for learning graphical models under the high-dimensional nonlinear and non-Gaussian setting, and prove that the proposed method is consistent under mild conditions. The proposed method works by performing a series of nonparametric conditional independence tests. The conditioning set of each test is reduced via a double regression procedure where a model-free sure independence screening procedure or a sparse deep neural network can be employed. The numerical results indicate that the proposed method works well for high-dimensional nonlinear and non-Gaussian data.Comment: 1 figur

    Nonlinear Sufficient Dimension Reduction with a Stochastic Neural Network

    Full text link
    Sufficient dimension reduction is a powerful tool to extract core information hidden in the high-dimensional data and has potentially many important applications in machine learning tasks. However, the existing nonlinear sufficient dimension reduction methods often lack the scalability necessary for dealing with large-scale data. We propose a new type of stochastic neural network under a rigorous probabilistic framework and show that it can be used for sufficient dimension reduction for large-scale data. The proposed stochastic neural network is trained using an adaptive stochastic gradient Markov chain Monte Carlo algorithm, whose convergence is rigorously studied in the paper as well. Through extensive experiments on real-world classification and regression problems, we show that the proposed method compares favorably with the existing state-of-the-art sufficient dimension reduction methods and is computationally more efficient for large-scale data

    Bayesian Peak Picking for NMR Spectra

    Get PDF
    AbstractProtein structure determination is a very important topic in structural genomics, which helps people to understand varieties of biological functions such as protein-protein interactions, protein–DNA interactions and so on. Nowadays, nuclear magnetic resonance (NMR) has often been used to determine the three-dimensional structures of protein in vivo. This study aims to automate the peak picking step, the most important and tricky step in NMR structure determination. We propose to model the NMR spectrum by a mixture of bivariate Gaussian densities and use the stochastic approximation Monte Carlo algorithm as the computational tool to solve the problem. Under the Bayesian framework, the peak picking problem is casted as a variable selection problem. The proposed method can automatically distinguish true peaks from false ones without preprocessing the data. To the best of our knowledge, this is the first effort in the literature that tackles the peak picking problem for NMR spectrum data using Bayesian method
    • …
    corecore